# Verify Hardware Availability
import tensorflow as tf

# Check for a GPU (tf.config.list_physical_devices('GPU') is the modern equivalent)
if tf.test.gpu_device_name():
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("Please install GPU version of TF")

# Check for a TPU
if tf.config.list_physical_devices('TPU'):
    print('TPU is available')
else:
    print('TPU is not available')
Please install GPU version of TF
TPU is available
In this exercise, we will adapt our image classification task to an object detection task. Object detection involves not only classifying objects within an image but also localizing them with bounding boxes.
Note: Due to the limited computational resources available, we'll be using a smaller subset of the Pascal VOC 2007 dataset and a lightweight object detection model. This might result in lower accuracy, but the focus of this exercise is on understanding the concepts and workflow of object detection.
%pip install tensorflow tensorflow-hub tensorflow-datasets matplotlib
Requirement already satisfied: tensorflow, tensorflow-hub, tensorflow-datasets, matplotlib, and their dependencies (output truncated)
# Import necessary libraries
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import cv2
from PIL import Image
import requests
from io import BytesIO
print("TensorFlow version:", tf.__version__)
print("TensorFlow Hub version:", hub.__version__)
TensorFlow version: 2.15.0
TensorFlow Hub version: 0.16.1
We will use the VOC2007 dataset, which contains images with annotations for object detection. For demonstration purposes, we will load a small subset of the dataset using TensorFlow Datasets.
with_info=True returns additional information about the dataset, which we'll use later.
The PASCAL VOC2007 (Visual Object Classes) dataset is a widely used benchmark dataset for object recognition tasks in computer vision. It comprises a collection of images annotated with bounding boxes and class labels for objects belonging to 20 different categories.
Key characteristics of the VOC2007 dataset:

- 20 object categories spanning people, animals, vehicles, and indoor objects
- Bounding-box and class-label annotations for every object instance
- Predefined train, validation, and test splits

Common use cases of the VOC2007 dataset:

- Benchmarking object detection models
- Image classification and semantic segmentation experiments
- Transfer learning and fine-tuning demonstrations
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt

# Load a smaller dataset
def load_data(split='train'):
    dataset, info = tfds.load('voc/2007', split=split, shuffle_files=True, with_info=True)
    return dataset, info

# Load the train dataset and extract info
train_dataset, train_info = load_data('train[:10%]')
# Load the validation dataset
validation_dataset, validation_info = load_data('validation[:10%]')

# Get class names
class_names = train_info.features["objects"]["label"].names
print("Class names:", class_names)
Class names: ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']
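Each entry in an example's objects/label feature is an integer index into this list. A quick sanity check (class_names copied verbatim from the output above):

```python
# The "label" feature stores integer IDs that index into class_names.
class_names = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
               'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
               'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

print(len(class_names))          # → 20
print(class_names[14])           # → person
print(class_names.index('dog'))  # → 11
```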
def display_examples(dataset, n=3):  # Display 'n' examples by default
    for example in dataset.take(n):
        image = example["image"]
        plt.figure(figsize=(5, 5))
        plt.imshow(image)
        plt.title("Image with Ground Truth Bounding Boxes")
        # Draw ground truth boxes (TFDS bbox format is normalized [ymin, xmin, ymax, xmax])
        for box in example["objects"]["bbox"]:
            ymin, xmin, ymax, xmax = box
            rect = patches.Rectangle((xmin * image.shape[1], ymin * image.shape[0]),
                                     (xmax - xmin) * image.shape[1], (ymax - ymin) * image.shape[0],
                                     linewidth=1, edgecolor='g', facecolor='none')
            plt.gca().add_patch(rect)
        plt.show()

display_examples(train_dataset)
We now have the list of all class names in the VOC2007 dataset, and we can select images containing our target classes (e.g., person, car, bird):

- class_names provides the list of class names.
- target_class_ids contains the IDs of the classes we are interested in.
- find_images_with_classes is a function to find images containing our target classes.

Next, consider when to load the detection model. Loading the model early (right after dataset loading):
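The find_images_with_classes helper mentioned above is not defined in this notebook, so here is a hypothetical minimal sketch. It assumes each example carries a list of integer label IDs, as in the TFDS VOC2007 "objects"/"label" feature; the example dicts are stand-ins for decoded dataset records.

```python
# Hypothetical sketch: find examples that contain at least one target class.
# Assumes each example exposes its object labels as a list of integer IDs.

def find_images_with_classes(examples, target_class_ids):
    """Return the examples whose label lists contain at least one target class."""
    target = set(target_class_ids)
    return [ex for ex in examples if target & set(ex["labels"])]

# VOC label IDs: 14 = person, 6 = car, 2 = bird
target_class_ids = [14, 6, 2]
examples = [
    {"name": "a", "labels": [14, 8]},   # contains a person
    {"name": "b", "labels": [0, 18]},   # aeroplane + train only
    {"name": "c", "labels": [6]},       # contains a car
]
matches = find_images_with_classes(examples, target_class_ids)
print([ex["name"] for ex in matches])  # → ['a', 'c']
```

In the real pipeline the same membership test could be applied inside dataset.filter, but the set-intersection logic is the core of it.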
- Pros: the model is immediately available; clear separation of setup and processing.
- Cons: potentially inefficient if data preparation is extensive or fails.
Loading the model after data preparation:
- Pros: more efficient resource use; avoids unnecessary loading if data preparation fails.
- Cons: the model isn't available for any data preparation steps that might need it.
In our specific case, loading the model after data preparation is slightly better because:
- Our data prep doesn't need the model.
- It's more resource-efficient.
- It follows a logical flow: prepare data, load tools, process data.
- It avoids unnecessary model loading if data prep fails.
However, the difference is minimal in this small-scale example. For beginners, loading major components upfront can sometimes be clearer and easier to follow. As a best practice, aim to load your model as close as possible to where you'll use it, ensuring all necessary data and resources are ready first.
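The "load the model as close as possible to where you use it" advice can be captured with a cached loader, so the expensive load runs only on first use. A self-contained sketch follows; the counter and "fake-detector" string are stand-ins for hub.load, which would take their place in the notebook.

```python
from functools import lru_cache

# Sketch of lazy model loading: the expensive load runs only on the first
# call, and repeat calls reuse the cached result.
load_calls = {"count": 0}

@lru_cache(maxsize=1)
def get_detector():
    load_calls["count"] += 1   # pretend this line is the expensive hub.load(...)
    return "fake-detector"

# Data preparation happens first; the model is untouched until needed.
assert load_calls["count"] == 0
detector = get_detector()        # first use triggers the load
detector_again = get_detector()  # cached; no second load
print(load_calls["count"])       # → 1
```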
#Load a pre-trained object detection model
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")
Let's break this down: hub.load downloads the SavedModel from TensorFlow Hub (or reuses a locally cached copy) and returns a callable detector object.

Advantages of this approach:

- Concise and readable
- Directly loads the model without additional wrapper functions
- TensorFlow Hub handles caching, so subsequent loads will be faster
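If you want control over where TensorFlow Hub caches downloaded models, the TFHUB_CACHE_DIR environment variable does this; it must be set before the first hub.load call. The directory path below is just an example:

```python
import os

# TensorFlow Hub caches downloaded models; TFHUB_CACHE_DIR controls where.
# Set it BEFORE calling hub.load so the download lands in this directory.
os.environ["TFHUB_CACHE_DIR"] = "/tmp/tfhub_cache"

print(os.environ["TFHUB_CACHE_DIR"])  # → /tmp/tfhub_cache
# detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")
```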
We will use the pre-trained model to detect objects in our selected images and display them with bounding boxes.
- detector is the pre-trained object detection model.
- run_detector_and_visualize (defined below) preprocesses an image, runs the detector, and draws both the ground-truth boxes (green) and the predicted boxes (red) with class labels and confidence scores.

Later in the exercise, process_uploaded_image will apply the same pipeline to an image you upload yourself: it takes the raw image data, preprocesses it, runs the detection model, and then plots and prints the detected objects.
# Run Detector and Visualize
def run_detector_and_visualize(example):
    image = example["image"]
    ground_truth_boxes = example["objects"]["bbox"]

    # Preprocess and run detection
    converted_img = tf.image.convert_image_dtype(image, tf.uint8)[tf.newaxis, ...]
    result = detector(converted_img)
    result = {key: value.numpy() for key, value in result.items()}

    # Visualize results (with ground truth for comparison)
    plt.figure(figsize=(10, 7))
    plt.imshow(image)

    # Ground truth boxes (TFDS bbox format is normalized [ymin, xmin, ymax, xmax])
    for box in ground_truth_boxes:
        ymin, xmin, ymax, xmax = box
        rect = patches.Rectangle((xmin * image.shape[1], ymin * image.shape[0]),
                                 (xmax - xmin) * image.shape[1], (ymax - ymin) * image.shape[0],
                                 linewidth=1, edgecolor='g', facecolor='none', label='Ground Truth')
        plt.gca().add_patch(rect)

    # Predicted boxes
    for i, score in enumerate(result['detection_scores'][0]):
        if score > 0.5:  # Confidence threshold
            ymin, xmin, ymax, xmax = result['detection_boxes'][0][i]
            class_id = int(result['detection_classes'][0][i])
            # Skip class IDs outside the VOC label list (the detector was trained
            # on COCO, so its class indices do not line up with the VOC names)
            if class_id < len(class_names):
                label = class_names[class_id]
                rect = patches.Rectangle((xmin * image.shape[1], ymin * image.shape[0]),
                                         (xmax - xmin) * image.shape[1], (ymax - ymin) * image.shape[0],
                                         linewidth=1, edgecolor='r', facecolor='none', label='Predicted')
                plt.gca().add_patch(rect)
                plt.text(xmin * image.shape[1], ymin * image.shape[0] - 5,
                         f'{label}: {score:.2f}', color='white', backgroundcolor='r')

    plt.legend()
    plt.show()
The run_detector_and_visualize function runs object detection on an image and displays the results, as you saw above: it converts the image to the appropriate format, runs the detector, and draws both the ground-truth and predicted boxes on the image.
# Take a few examples from the training set
for example in train_dataset.take(2):  # Process 2 images
    run_detector_and_visualize(example)
print("\nProcessing sample images from the dataset:")
for i, example in enumerate(train_dataset.take(3)):
    print(f"\nSample image {i+1}")
    run_detector_and_visualize(example)
Processing sample images from the dataset:

Sample image 1
Sample image 2
Sample image 3
The evaluate_model_performance function evaluates our object detection model on a dataset. It takes the dataset to evaluate on, the detector, an IoU threshold, and the number of images to use, then computes and prints precision and recall based on the detections.
# Evaluate Model Performance
def evaluate_model_performance(dataset, detector, iou_threshold=0.5, num_samples=100):
    true_positives = 0
    false_positives = 0
    false_negatives = 0

    for example in dataset.take(num_samples):
        image = example["image"].numpy()
        gt_boxes = example["objects"]["bbox"].numpy()
        gt_labels = example["objects"]["label"].numpy()

        # Preprocess and run detection (same as before)
        converted_img = tf.image.convert_image_dtype(image, tf.uint8)[tf.newaxis, ...]
        result = detector(converted_img)
        result = {key: value.numpy() for key, value in result.items()}

        pred_boxes = result['detection_boxes'][0]
        pred_scores = result['detection_scores'][0]
        pred_labels = result['detection_classes'][0].astype(int)

        matched_gt = set()  # ground-truth boxes matched in this image

        # Iterate over predicted boxes
        for i, score in enumerate(pred_scores):
            if score < 0.5:  # Confidence threshold
                continue

            # The detector already returns normalized [ymin, xmin, ymax, xmax],
            # the same layout as the TFDS ground-truth boxes, so no reordering
            # is needed before computing IoU.
            pred_box = pred_boxes[i]

            # Find the best-matching ground truth box (if any) based on IoU
            best_iou = 0
            gt_index = -1
            for j, gt_box in enumerate(gt_boxes):
                iou = calculate_iou(gt_box, pred_box)
                if iou > best_iou:
                    best_iou = iou
                    gt_index = j

            # If IoU exceeds threshold AND the class matches, count a true positive
            if best_iou > iou_threshold and pred_labels[i] == gt_labels[gt_index]:
                true_positives += 1
                matched_gt.add(gt_index)
            else:
                false_positives += 1

        # Count false negatives (ground truth boxes with no matching prediction)
        false_negatives += len(gt_boxes) - len(matched_gt)

    precision = true_positives / (true_positives + false_positives) if true_positives + false_positives > 0 else 0
    recall = true_positives / (true_positives + false_negatives) if true_positives + false_negatives > 0 else 0

    print(f"Model Performance (IoU Threshold = {iou_threshold:.2f}):")
    print(f"True Positives: {true_positives}")
    print(f"False Positives: {false_positives}")
    print(f"False Negatives: {false_negatives}")
    print(f"Precision: {precision:.2f}")
    print(f"Recall: {recall:.2f}")
# Helper used above to compute the Intersection over Union of two boxes
def calculate_iou(box1, box2):
    """Calculates the Intersection over Union (IoU) between two bounding boxes.

    Args:
        box1 (list): Coordinates of the first box in the format [ymin, xmin, ymax, xmax].
        box2 (list): Coordinates of the second box in the same format.

    Returns:
        float: The IoU value (between 0 and 1).
    """
    # 1. Calculate coordinates of the intersection rectangle
    y1 = max(box1[0], box2[0])
    x1 = max(box1[1], box2[1])
    y2 = min(box1[2], box2[2])
    x2 = min(box1[3], box2[3])

    # 2. Calculate areas of the intersection and the union
    intersection_area = max(0, y2 - y1) * max(0, x2 - x1)
    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - intersection_area

    # 3. Calculate IoU
    if union_area == 0:
        return 0  # Avoid division by zero
    else:
        return intersection_area / union_area
# Evaluate model performance
print("Evaluating model performance...")
evaluate_model_performance(validation_dataset, detector)  # Use the validation split for evaluation
Evaluating model performance...
Model Performance (IoU Threshold = 0.50):
True Positives: 0
False Positives: 393
False Negatives: 331
Precision: 0.00
Recall: 0.00
Object detection models need to be evaluated on two fronts:
Classification Accuracy: Did the model correctly identify the object's class (e.g., person, car, bird)?
Localization Accuracy: Did the model accurately draw a bounding box around the object?
Our exercise focuses on assessing localization accuracy using the Intersection over Union (IoU) metric. Note that the run above reported zero true positives: the SSD MobileNet model was trained on COCO, so its integer class IDs do not line up with the VOC label list, and the predicted labels therefore never match the ground truth even when the boxes overlap well.
Understanding IoU (Intersection over Union)
IoU measures how much two bounding boxes overlap.
The iou_threshold in the code (default 0.5) means a predicted box is considered a "true positive" only if its IoU with a ground truth box exceeds 0.5 (and the class matches).
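To make IoU concrete, here is a small worked example using the same box format and arithmetic as the calculate_iou function above (reproduced compactly so the example is self-contained):

```python
def calculate_iou(box1, box2):
    """IoU between two boxes in [ymin, xmin, ymax, xmax] format (as above)."""
    y1, x1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    y2, x2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    intersection = max(0, y2 - y1) * max(0, x2 - x1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - intersection
    return intersection / union if union else 0

# Two unit squares offset horizontally by half their width:
# intersection = 1 * 0.5 = 0.5, union = 1 + 1 - 0.5 = 1.5, IoU = 0.5 / 1.5 = 1/3
print(round(calculate_iou([0, 0, 1, 1], [0, 0.5, 1, 1.5]), 3))  # → 0.333
```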
The function will print the following metrics: the counts of true positives, false positives, and false negatives, along with precision (the fraction of predictions that were correct) and recall (the fraction of ground-truth objects that were found).
Example Results: Let's say the output is:

Model Performance (IoU Threshold = 0.50):
True Positives: 75
False Positives: 20
False Negatives: 15
Precision: 0.79
Recall: 0.83

Interpretation:

Precision is 0.79, meaning 79% of the boxes the model predicted were correct detections.
Recall is 0.83, meaning the model found 83% of the actual objects in the images.
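These numbers follow directly from the definitions used in evaluate_model_performance, as a quick check shows:

```python
# Precision and recall from the example counts above
tp, fp, fn = 75, 20, 15

precision = tp / (tp + fp)   # 75 / 95
recall = tp / (tp + fn)      # 75 / 90

print(f"{precision:.2f}")  # → 0.79
print(f"{recall:.2f}")     # → 0.83
```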
Key Takeaways:

- IoU measures how well a predicted box is localized; the threshold (here 0.5) decides what counts as a correct detection.
- Precision and recall must be read together: a model can score high on one while doing poorly on the other.
- Raising the confidence or IoU threshold typically trades recall for precision.
This final block lets you upload your own image for object detection, making the exercise interactive.
# Function to process uploaded images (for Google Colab)
def process_uploaded_image(image_data):
    """Processes and displays detections for an uploaded image."""
    image = Image.open(BytesIO(image_data))
    image_np = np.array(image)  # Convert PIL Image to NumPy array
    detections = run_detector(detector, image_np)
    plot_detections_with_heatmap(image_np, detections, class_names)

    # Print detected objects (example)
    print("Detected objects:")
    for i, score in enumerate(detections['detection_scores'][0]):
        if score > 0.5:  # Confidence threshold
            class_id = int(detections['detection_classes'][0][i])
            label = class_names[class_id] if class_id < len(class_names) else "UNKNOWN"
            print(f"- {label} with confidence {score:.2f}")

# Instructions for image uploading (if in Google Colab)
print("\nTo upload your own image for object detection:")
print("1. If using Google Colab, use:")
print("   from google.colab import files")
print("   uploaded = files.upload()")
print("   image_data = next(iter(uploaded.values()))")
print("2. Then run:")
print("   process_uploaded_image(image_data)")
from google.colab import files
uploaded = files.upload()
image_data = next(iter(uploaded.values()))
process_uploaded_image(image_data)
To upload your own image for object detection:
1. If using Google Colab, use:
   from google.colab import files
   uploaded = files.upload()
   image_data = next(iter(uploaded.values()))
2. Then run:
   process_uploaded_image(image_data)
Saving GTR.PNG to GTR.PNG
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-12-60b598ea79d6> in process_uploaded_image(image_data)
----> 6     detections = run_detector(detector, image_np)

NameError: name 'run_detector' is not defined
# Function to process uploaded images (for Google Colab)
from PIL import Image
from io import BytesIO
import numpy as np
import tensorflow as tf

# Add the run_detector function definition
def run_detector(detector, image_np):
    """Runs the object detector on the input image."""
    input_tensor = tf.convert_to_tensor(image_np)
    input_tensor = input_tensor[tf.newaxis, ...]
    detections = detector(input_tensor)

    num_detections = int(detections.pop('num_detections'))
    detections = {key: value[0, :num_detections].numpy()
                  for key, value in detections.items()}
    detections['num_detections'] = num_detections
    detections['detection_classes'] = detections['detection_classes'].astype(np.int64)
    return detections

def process_uploaded_image(image_data):
    """Processes and displays detections for an uploaded image."""
    image = Image.open(BytesIO(image_data))
    image_np = np.array(image)  # Convert PIL Image to NumPy array
    detections = run_detector(detector, image_np)  # Call the run_detector function
    plot_detections_with_heatmap(image_np, detections, class_names)  # Make sure plot_detections_with_heatmap is defined or imported

    # Print detected objects (run_detector already stripped the batch dimension,
    # so the arrays are indexed directly rather than with a leading [0])
    print("Detected objects:")
    for i, score in enumerate(detections['detection_scores']):
        if score > 0.5:  # Confidence threshold
            class_id = int(detections['detection_classes'][i])
            label = class_names[class_id] if class_id < len(class_names) else "UNKNOWN"
            print(f"- {label} with confidence {score:.2f}")

# Instructions for image uploading (if in Google Colab)
print("\nTo upload your own image for object detection:")
print("1. If using Google Colab, use:")
print("   from google.colab import files")
print("   uploaded = files.upload()")
print("   image_data = next(iter(uploaded.values()))")
print("2. Then run:")
print("   process_uploaded_image(image_data)")
from google.colab import files
uploaded = files.upload()
image_data = next(iter(uploaded.values()))
process_uploaded_image(image_data)
Saving GTR.PNG to GTR (1).PNG
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-cfacdd671101> in run_detector(detector, image_np)
---> 11     detections = detector(input_tensor)

TypeError: Binding inputs to tf.function failed due to `Can not cast
TensorSpec(shape=(1, 342, 762, 4), dtype=tf.uint8, name=None) to
TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name=None)` for signature:
(input_tensor: TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name=None)).
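The error above says the uploaded PNG carries four channels (RGBA) while the detector's signature expects three (RGB). A minimal, self-contained sketch of the fix, using Pillow as in the imports earlier; the in-memory 8x8 image is just a stand-in for the uploaded file:

```python
import numpy as np
from PIL import Image

# Build a small RGBA image in memory to stand in for the uploaded PNG.
rgba = Image.new("RGBA", (8, 8), (52, 68, 95, 255))
print(np.array(rgba).shape)  # → (8, 8, 4)  -- four channels, rejected by the detector

# Converting to RGB drops the alpha channel, matching the (1, None, None, 3) signature.
rgb = rgba.convert("RGB")
print(np.array(rgb).shape)   # → (8, 8, 3)
```

In process_uploaded_image, this amounts to opening the file with Image.open(BytesIO(image_data)).convert('RGB') before converting to a NumPy array.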
from google.colab import files
# Upload files interactively in Google Colab
uploaded = files.upload()
# Extract the image data from the uploaded file
image_data = next(iter(uploaded.values()))
# Process the uploaded image
process_uploaded_image(image_data)
Saving evo-x.avif to evo-x.avif
---------------------------------------------------------------------------
UnidentifiedImageError                    Traceback (most recent call last)
<ipython-input-14-6718a09a9413> in process_uploaded_image(image_data)
---> 21     image = Image.open(BytesIO(image_data))

UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f08f03aec50>

(This Pillow build cannot decode AVIF files, so upload a JPEG or PNG instead.)
# Function to process uploaded images (for Google Colab)
def process_uploaded_image(image_data):
    """Processes and displays detections for an uploaded image."""
    # Convert to RGB so PNGs with an alpha channel (4 channels) match the
    # detector's expected 3-channel input (this is the TypeError seen above)
    image = Image.open(BytesIO(image_data)).convert('RGB')
    image_np = np.array(image)  # Convert PIL Image to NumPy array
    detections = run_detector(detector, image_np)

    # Placeholder visualization function (replace with your actual plotting logic)
    def plot_detections_with_heatmap(image_np, detections, class_names):
        """Plots detections with a heatmap (implementation not provided)."""
        # Add your plotting logic here
        print("Visualization with heatmap would be here.")

    plot_detections_with_heatmap(image_np, detections, class_names)

    # Print detected objects (run_detector already stripped the batch dimension)
    print("Detected objects:")
    for i, score in enumerate(detections['detection_scores']):
        if score > 0.5:  # Confidence threshold
            class_id = int(detections['detection_classes'][i])
            label = class_names[class_id] if class_id < len(class_names) else "UNKNOWN"
            print(f"- {label} with confidence {score:.2f}")

# Instructions for image uploading (if in Google Colab)
print("\nTo upload your own image for object detection:")
print("1. If using Google Colab, use:")
print("   from google.colab import files")
print("   uploaded = files.upload()")
print("   image_data = next(iter(uploaded.values()))")
print("2. Then run:")
print("   process_uploaded_image(image_data)")
from google.colab import files
uploaded = files.upload()
image_data = next(iter(uploaded.values()))
process_uploaded_image(image_data)
To upload your own image for object detection:
1. If using Google Colab, use:
   from google.colab import files
   uploaded = files.upload()
   image_data = next(iter(uploaded.values()))
2. Then run:
   process_uploaded_image(image_data)
Saving GTR.PNG to GTR (2).PNG
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-40d803eaf24c> in process_uploaded_image(image_data)
----> 6     detections = run_detector(detector, image_np)

<ipython-input-13-cfacdd671101> in run_detector(detector, image_np)
---> 11     detections = detector(input_tensor)

TypeError: Binding inputs to tf.function failed due to `Can not cast TensorSpec(shape=(1, 342, 762, 4), dtype=tf.uint8, name=None) to TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name=None)`.
Received args: a tensor of shape (1, 342, 762, 4), dtype=uint8 (array values omitted) and kwargs: {} for signature: (input_tensor: TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name=None)).
from PIL import Image
import numpy as np
from io import BytesIO

# Define function to process uploaded image data
def process_uploaded_image(image_data):
    """Processes and displays detections for an uploaded image."""
    # Load image from binary data and convert to NumPy array
    image = Image.open(BytesIO(image_data))
    image_np = np.array(image)  # Convert PIL Image to NumPy array

    # Run the object detector on the image
    detections = run_detector(detector, image_np)

    # Plot image with detections and overlay heatmap (if applicable)
    plot_detections_with_heatmap(image_np, detections, class_names)

    # Print detected objects with their confidence scores
    print("Detected objects:")
    for i, score in enumerate(detections['detection_scores'][0]):
        if score > 0.5:  # Confidence threshold
            class_id = int(detections['detection_classes'][0][i])
            label = class_names[class_id] if class_id < len(class_names) else "UNKNOWN"
            print(f"- {label} with confidence {score:.2f}")

# Instructions for using this in Google Colab
print("\nTo upload your own image for object detection:")
print("1. If using Google Colab, use:")
print("   from google.colab import files")
print("   uploaded = files.upload()")
print("   image_data = next(iter(uploaded.values()))")
print("2. Then run:")
print("   process_uploaded_image(image_data)")

from google.colab import files

# Trigger the file upload dialog
uploaded = files.upload()

# Extract the image data from the uploaded files
image_data = next(iter(uploaded.values()))

# Process the uploaded image
process_uploaded_image(image_data)
To upload your own image for object detection:
1. If using Google Colab, use:
   from google.colab import files
   uploaded = files.upload()
   image_data = next(iter(uploaded.values()))
2. Then run:
   process_uploaded_image(image_data)
Saving GTR.PNG to GTR (3).PNG
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-cec44a385083> in process_uploaded_image(image_data)
---> 14     detections = run_detector(detector, image_np)

<ipython-input-13-cfacdd671101> in run_detector(detector, image_np)
---> 11     detections = detector(input_tensor)

TypeError: Binding inputs to tf.function failed due to `Can not cast TensorSpec(shape=(1, 342, 762, 4), dtype=tf.uint8, name=None) to TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name=None)`.
Received args: a tensor of shape (1, 342, 762, 4), dtype=uint8 (array values omitted) and kwargs: {} for signature: (input_tensor: TensorSpec(shape=(1, None, None, 3), dtype=tf.uint8, name=None)).
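Both TypeErrors share the same cause: PNG files with an alpha channel load as RGBA, so the input tensor has shape (1, H, W, 4), while the detector's signature requires exactly 3 channels. Converting to RGB before building the NumPy array resolves it. A minimal sketch; `load_rgb_array` is a hypothetical helper whose output would feed the existing `run_detector` call:

```python
import numpy as np
from io import BytesIO
from PIL import Image

def load_rgb_array(image_data):
    """Decode uploaded bytes into a 3-channel RGB array,
    discarding any alpha channel that would violate the detector's
    (1, None, None, 3) input signature."""
    image = Image.open(BytesIO(image_data)).convert('RGB')
    return np.array(image)
```

Inside `process_uploaded_image`, replacing the two image-loading lines with `image_np = load_rgb_array(image_data)` makes RGBA PNGs like GTR.PNG work without further changes.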
This exercise introduces you to object detection while keeping computational requirements relatively low. It uses a pre-trained model, so no training is required, making it suitable for systems with limited resources. It covers:
- Using pre-trained models for complex tasks
- The basics of object detection (bounding boxes, class labels, confidence scores)
- Visualizing detection results
- Simple analysis of detection outputs

The exercise is also interactive, allowing students to try object detection on their own chosen images.
My overall experience with this lab was enjoyable, and I feel as if I learned a lot. That said, the lack of instructions made the lab far more frustrating than it should have been. The "upload your own image" section gave me lots of problems (see notes), and most of them could have been solved with better instructions. This lab was a useful tool, but clearer instructions would make it a much more viable one.